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NUMERICAL METHODS FOR A CLASS OF NONLOCAL 
DIFFUSION PROBLEMS WITH THE USE OF BACKWARD SDES 

G. ZHANG 1 * * * § , W. ZHAO*, C. G. WEBSTER 1 *, AND M. GUNZBURGER§ 


Abstract. We propose a novel numerical approach for nonlocal diffusion equations [8] with 
integrable kernels, based on the relationship between the backward Kolmogorov equation and back¬ 
ward stochastic differential equations (BSDEs) driven by Levy processes with jumps. The nonlocal 
diffusion problem under consideration is converted to a BSDE, for which numerical schemes are de¬ 
veloped and applied directly. As a stochastic approach, the proposed method does not require the 
solution of linear systems, which allows for embarrassingly parallel implementations and also enables 
adaptive approximation techniques to be incorporated in a straightforward fashion. Moreover, our 
method is more accurate than classic stochastic approaches due to the use of high-order temporal 
and spatial discretization schemes. In addition, our approach can handle a broad class of problems 
with general nonlinear forcing terms as long as they are globally Lipchitz continuous. Rigorous error 
analysis of the new method is provided as several numerical examples that illustrate the effectiveness 
and efficiency of the proposed approach. 

Key words, backward stochastic differential equation with jumps, nonlocal diffusion equations, 
superdiffusion, compound Poisson process, 0-scheme, adaptive approximation 


1. Introduction. A diffusion process is deemed nonlocal when the associated 
underlying Levy process does not only consist of Brownian motions. The features 
of nonlocal diffusion has been verified experimentally to be present in a wide vari¬ 
ety of applications, including, e.g., contaminant flow in groundwater, plasma physics, 
and the dynamics of financial markets. A comprehensive survey of nonlocal diffu¬ 
sion problems is given in [15]. In this work, we consider a partial-integral diffusion 
equation (PIDE) representation [6, 11, 8] of linear nonlocal diffusion problems. Since 
it is typically difficult to obtain analytical solutions of such problems, numerical so¬ 
lutions are highly desired in practical applications. There are mainly two types of 
numerical methods for nonlocal diffusion equations under consideration. The first is 
the family of deterministic approaches, e.g., finite-element-type methods [6, 11] and 
collocation methods, whereas the second can be classified as stochastic approaches, 
e.g., continuous-time random walk (CTRW) methods [5, 10, 15]. 

The deterministic approaches are extensions of existing methods for local partial 
differential equations (PDEs) that incorporate schemes to discretize the underlying 
integral operators. The recently developed nonlocal vector calculus [9] provides help¬ 
ful tools that allow one to analyze nonlocal diffusion problems in a similar manner 
to analyzing local PDEs; see [8, 11] for details. However, in the context of implicit 
time-stepping schemes, the nonlocal operator may result in severe computational diffi¬ 
culties coming from the dramatic deterioration of the sparsity of the stiffness matrices 
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required by the underlying linear systems. On the other hand, stochastic approaches, 
e.g., CTRW methods, are based on the relation between nonlocal diffusion and a class 
of Levy jump processes. Although stochastic methods do not require the solution of 
linear systems, and the simulations of all paths of the random walk can be easily 
parallelized, they also have some drawbacks. Typically, most stochastic methods are 
sampling-based Monte Carlo approaches, suffering from slow convergence, thus re¬ 
quiring a very large number of samples to achieve small errors. Even worse, in the 
case of a general nonlinear forcing term, the nonlocal diffusion equation is no longer 
the master equation of the underlying jump processes. 

In this paper, we propose a novel stochastic numerical scheme for the linear 
nonlocal diffusion problems studied in [5, 6, 10, 9, 11, 8] based on the relationship 
between the PIDEs and a certain class of backward stochastic differential equations 
(BSDEs) with jumps. The existence and uniqueness of solutions for nonlinear BSDEs 
with and without jumps have been proved in [17] and [1], respectively. Since then, 
BSDEs have become important tools in probability theory, stochastic optimal control 
and mathematical finance [7]. Unlike BSDEs driven by Brownian motions, there are 
very few numerical schemes proposed for BSDEs with jumps, and most of the schemes 
are focussed on time discretizations only. For example, a numerical scheme based on 
Picard’s method was proposed in [14], and a forward Euler scheme was proposed in 
[2, 3] where the convergence rate was proved to be (Af)5. 

Our focus will be on the BSDEs corresponding to nonlocal diffusion equations 
with integrable jump kernels in unbounded domains. Since the PIDEs of interest do 
not possess local convection and diffusion terms, the corresponding BSDEs can be 
simplified, such that, the underlying Levy processes are merely compound Poisson 
processes. Also, motivated by various engineering applications discussed in [8], our 
goal is to develop and analyze computationally efficient numerical schemes that can 
achieve high-order accuracy in both temporal and spatial discretization, rather than 
focus on the scalability of our schemes in solving extremely high-dimensional problems 
(e.g., in option pricing). From this perspective, although the BSDEs under considera¬ 
tion is a special case of the equations studied in [2], high-order fully-discrete schemes 
and the corresponding error analysis are still missing in the literature. In fact, based 
on our previous work, it is particularly challenging to recover the classic finite element 
convergence rates in the spatial discretization, even for the simplified BSDEs. Thus, 
the main contributions of this paper are summarized as follows: 

• Development of stable higlr-order matrix-free numerical schemes for the BS¬ 
DEs and the corresponding nonlocal diffusion problems; 

• Analysis of the approximation error of the proposed semi-discrete and fully- 
discrete schemes, and; 

• Numerical demonstration of the capabilities of handling multiple types of 
integrable kernels, e.g., symmetric, non-symmetric, or singular kernels. 

Specifically, the BSDEs will be discretized, in the temporal domain, using a 9- 
scheme extended from our previous works [21, 20] for BSDEs without jumps. In 
particular, the cases where 6 = 0, 9 = ^ and 9=1 correspond to forward Euler, 
Crank-Nicolson and backward Euler schemes, respectively. As discussed in [21], a 
quadrature rule adapted to the underlying stochastic processes is critical to approx¬ 
imate all the conditional mathematical expectations in our numerical scheme. Thus, 
we propose a general formulation of the quadrature rule for estimating the conditional 
expectations with respect to the compound Poisson processes, and a specific form of 
a certain PIDE can be determined based on regularities of the kernel and forcing 
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term. In §5, Gauss-Legendre, Gauss-Jacobi and Newton-Cotes rules are substituted 
into the proposed quadrature rule to approximate different kernels and forcing terms. 
In addition, since the total number of quadrature points grows exponentially with 
the number of time steps, we construct a piecewise Lagrange interpolation scheme 
based on a pre-determined spatial mesh, in order to evaluate the integrand of the 
expectations at all quadrature points. 

Both theoretical analysis and numerical experiments show that our approach is 
advantageous for linear nonlocal diffusion equations with integrable kernels on un¬ 
bounded domains. Compared to deterministic approaches, e.g., finite elements and 
collocation approaches, the proposed methods do not require the solution of dense lin¬ 
ear systems. Instead, the PIDE can be solved independently at different spatial grid 
points on each time level, making it straightforward to incorporate parallel imple¬ 
mentation and adaptive spatial approximation. Compared to stochastic approaches, 
our scheme is more accurate than the CTRW method due to the use of the 0-scheme 
for time discretization, the high-order quadrature rule, and piecewise Lagrange poly¬ 
nomial interpolation for spatial discretization. In addition, our method can handle 
a broad class of problems with general nonlinear forcing terms as long as they are 
globally Lipchitz continuous. 

The outline of the paper is as follows. In §2, we introduce the mathematical 
description of the linear nonlocal diffusion equations and the corresponding class of 
BSDEs under consideration. In §3, we propose our numerical schemes for the BSDEs 
of interest, where the senri-discrete and the fully-discrete approximations are presented 
in §3.1 and §3.2, respectively. Error estimates for the proposed scheme are proved in 
§4. Extensive numerical examples and comparisons to existing techniques are given 
in §5, which are shown to be consistent with the theoretical results and reveal the 
effectiveness of our approach. Finally, several concluding remarks are given in § 6 . 

2. Nonlocal diffusion models and BSDEs with jumps. This section is ded¬ 
icated to describing the nonlocal diffusion problem that is the focus of this paper. In 
particular, in § 2.1 we introduce the definitions of a general nonlocal operator and dif¬ 
fusion equation. The relationship between the backward Kolmogorov equation, which 
is a generalization of the nonlocal diffusion equation of interest, and the BSDE with 
jumps, is described in §2.2. Based on this equivalence, a simplified BSDE correspond¬ 
ing to the nonlocal diffusion problem of § 2.1 is also given at the end of § 2 . 2 . 

2.1. Nonlocal diffusion equations. Let us recall a nonlocal operator intro¬ 
duced in [8]. For a function u(t, x) : [0, T] x 1 J -) R, with d = 1, 2,3 and T > 0, we 
define the action of the linear operator £ on u(t, x) as 

£u = (u(t, x + e) — u(t, x)) 7 (e) de, V(t, x) € [0, T] x R d , (2.1) 

jR d 

where the properties of £ depend crucially on the kernel function 7 (e) : R d —x R. In 
this work, we focus on a particular class of kernel functions which are nonnegative 
and integrable, i.e., 

7 (e) >0 Ve € R d and / 7 (e) de < 00 . (2.2) 

JR d 

Note that 7 (e) may be symmetric , i.e., 7 (e) = 7 (—e) for any e G R d , or non- 
symmetric, i.e., there exists e G R d such that 7 (e) 7 ^ 7 (—e). The operator £ exhibits 
nonlocal behavior because, for a fixed t G [0, T], the value of £u at a point x = 
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(x 1 ,..., x d ) G R d requires information about u at points x + e ^ x; this is contrasted 
with local operators, e.g., the value of A u at a point x requires information about u 
only in an infinitesimal neighborhood of x. Our interest in the operator C is due to 
its participation in the nonlocal diffusion equation 


— (■ t , x) — Cu(t, x) = g(t, x , u) for (t, x) G (0, T] x 

u( 0, x) = uq{x), for x G 


(2.3) 


where uq{x) is the initial condition and the forcing term g(t, x, u) may be a nonlinear 
function of t,x and u. We remark that (2.3) is a nonlocal Cauchy problem defined 
on the unbounded spatial domain Also, in the context of kernel functions 7 (e) 
that are compactly supported in the initial-boundary value nonlocal problem with 
volume constraints, i.e., constraints applied on a regions of nonzero measure in 
has been well studied [ 6 , 8 , 11]. However, the well-posedness of the corresponding 
BSDEs with volume constraints is still an open challenge. As such, we do not impose 
volume constraints to (2.3) in this effort. 


2.2. BSDEs and backward Kolmogorov equations. Now we discuss the 
probabilistic representation of the solution of the nonlocal diffusion equation in (2.3). 
The nonlinear Feynman-Kac formula studied in [16] shows that BSDEs driven by 
Brownian motion provides a probabilistic representation of the solutions of a class 
of second-order quasi-linear parabolic PDEs. This result was extended in [1] to the 
case of BSDEs driven by general Levy processes with jumps. Due to the inclusion 
of Levy jumps, the counterpart of a BSDE is a partial integral differential equation 
(PIDE), i.e., the backward Kolmogorov equation [13]. It turns out that such a PIDE 
is a generalization of the nonlocal diffusion equation in (2.3). Here we first introduce 
a general form of a BSDE with jumps and the backward Kolmogorov equation, then 
give a simplified form of the BSDE corresponding to the nonlocal diffusion equation 
(2.3) under consideration. 

Let (fbJ 7 , P) be a probability space with a filtration {E t }o<t<T for a finite 
terminal time T > 0. We assume the filtration {iFt}o<t<T satisfies the usual hy¬ 
potheses of completeness, i.e., Eo contains all sets of P-measure zero and maintains 
right continuity, i.e., T t = E t+ . Moreover, the filtration is assumed to be gener¬ 
ated by two mutually independent processes, i.e., one m-dimensional Brownian mo¬ 
tion Wt = {Wf ,..., W t m ) T and one d-dimensional Poisson random measure /i(A, t) 
on E x [0, T], where E = M d \{0} is equipped with its Borel field £. The com¬ 
pensator of g and the resulting compensated Poisson random measure are denoted 
by v(de,dt) = X(de)dt and f(de,dt) = g(de,dt) — X(de)dt respectively, such that 
{fi(A x [0, t]) = (g — v){Ax [0, t])}o<t<T is a martingale for all A G £. We also assume 
that X(de) is a tr-finite measure on (E,£) satisfying 


(1 A |e| 2 )A(cie) < + 00 . 


(2.4) 


Based on the stochastic basis (fi, T, {-P}o<t<T, P), we introduce the backward stochas- 
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tic differential equation with jumps 


X t = X 0 + [ b(s,X s )ds + I cr(s,X s )dW s + f ( p(s,X s _,e)jl(de,ds) 
Jo Jo Jo JE 


Y t = £+ f(s,X s ,Y a ,Z s ,T s )ds— / Z s dW s - 


U s (e)fJ(de,ds), 


it JE 


(2.5) 

where the solution is the quadruplet (X t , Y t , Z t , U t ) with X t G R d , Y t G R 9 , Z t G 
R 9Xm and Ut G R 9 . The map b : [0, T] x R d —> R d is referred to as the drift coefficient, 
<x : [0, T] x R d —>• R d x R m is the local diffusion coefficient, P : [0, T] x R d x R d —» R d 
is the jump coefficient, f : [0,T] x R m x R d x R dxm x R d —> R 9 is the generator of 
the BSDE, and the processes T s : [0, T] —> R 9 is defined by T s = f E U s (e)r/(e)X(de) 
for a given bounded function rj : R d —> R. The terminal condition £, which is an 
.F-r-measurable random vector, is assumed to be a function of Xt, denoted by £ = 
c p(Xt )• Note that the integrals in (2.5) with respect to the m-dimensional Brownian 
motion W t and the d-dimensional compensated Poisson measure jl(de, dt) are Ito-type 
stochastic integrals. The first equation in (2.5) is the forward SDE system, which is 
a general Levy process, while the second equation is a system of BSDEs driven by 
X t . The quadruplet (X t , Y t , Z t , U t ) is called an L 2 -adapted solution if it is {T t }- 
adapted, square integrable process which satisfies the BSDE in (2.5). Under standard 
assumptions on the given functions b , er, /, ip and P, the existence and uniqueness 
of the solution of the system in (2.5) with a nonlinear generator / have been proved 
in [1], 

Based on the extension of the nonlinear Feynman-Kac formula for BSDEs pro¬ 
posed in [1], the adapted solution (X t , Y t ,Z t , U t ) of (2.5) can be associated to the 
unique viscosity solution v(t,x ) G C([0,T] x R d ) of the backward Kolmogorov equa¬ 
tion, i.e., 


— {t,x) + Cv(t,,x) + f(t,x,v,crX7v,Bv) = 0, for ( t,x) G [0,T) x R d 
v(T,x) = <p(x), for x G R d . 


In (2.6), p>(x) is the terminal condition at the time t = T, and the second-order 
integral-differential operator C is of the form 

~ d dv 1 d d 2 v 

Cv(t, x) = J2 bi(t, x) + -J2 x)T-^-{t, x) 

(2.7) 


i,j —1 


+ J ^v(t,x +P(t,x,e)) ~v{t.,x)%t,x,e)^j X{d e ), 


with B is an integral operator defined as 


Bv(t,x) = / [v(t,x + P(t, x,e)) — v(t,x)]r](e)X(de). 

J E 

The functions b, cr , /, tp, P in (2.7) have the same definitions as in the BSDE in (2.5). 
Under the condition that X s = x for a fixed s G [0,T), the solution (Y t , Z t ,U t ) of 
the BSDE for s < t < T can be represented by 

r Y t = v (t,x t ), 
l Z t = *(t,X t )Vv(t,X t ), 

{ U t = v(t,X t - + P(t,X t -,e)) - v(t,X t -), 


( 2 . 8 ) 
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where Vu denotes the gradient of v with respect to X t and T t = Bv(t, X t ). By com¬ 
paring the operator C in (2.7) and the BSDE in (2.5), we can see that the incremental 
dX t consists of three components: the drift term b(t, X t )dt, the Brownian diffusion 
<x(f, X t )dWt , and the jump diffusion f E (3{t, X t -, e)p{de, dt). 

Now let us relate the BSDE in (2.5) to the nonlocal diffusion equation in (2.3). 
Without loss of generality, we only consider the case when Y t is a scalar function, i.e., 
by setting q = 1 in (2.5); the numerical schemes and analysis presented in this paper 
can be directly extended to the case of q > 1. Observing that the operator C in (2.7) 
is a generalization of the nonlocal operator C in (2.3), we aim to find a simplified form 
of the BSDE which is the stochastic representation of the nonlocal diffusion equation 
in (2.3). To proceed, we define A(de) = 7 (e)de for e G E and A (de) = 0 for e ^ E, 
satisfying the condition in (2.4). Due to the integrability assumption of 7 (e) in (2.2), 
the jump diffusion term in (2.5) is a compensated compound Poisson process. In this 
case, the compensated Poisson random measure p{de , dt) can be represented by 

p(de , dt) = p{de, dt) — Xp(e)dedt, (2.9) 

where A (de) = A p(e)de, with A being the intensity of Poisson jumps and p(e)de being 
the probability measure of the jump size f3(t,x,e) satisfying f E p(e)de = 1. Then, 
the operator C in (2.7) can be simplified to the nonlocal operator C in (2.1) by setting 
to = 0, i.e., removing the Brownian motion, and 

<t = 0, /3 = el D (e), A= f 'y(e)de, 

j , JE (2-10) 

p{e) = - 7 ( e )> h = / ej(e)de for i = 1 ,..., d, 

A Je 

where 2£>(e) is the characteristic function of the domain D. In the context of /3 = 
elo(e), the drift coefficient b is a constant vector which indeed helps cancel the first- 
order derivatives of u in C. On the other hand, substituting 7 (e) into the definition 
of b , we have 


/ bi ds = X / ep(e)deds = AtE[e] for * = 1 ,..., d, 
Jo Jo Je 


( 2 . 11 ) 


which is the compensator of X t . Hence, X t is just a compound Poisson process under 
the Poisson measure p(de,dt) without compensation. Then, by defining f(t,x,u) = 
g(T — t,x,u) and ip(x) = Uo(x), with g and uq being the forcing term and initial 
condition in (2.3) respectively, we obtain the BSDE corresponding to the nonlocal 
diffusion equation in (2.3) 


X t =X, 


n e p(de , ds), 

!/ 

Y t =tp(X T )+[ f(s,X s ,Y s )ds- f f U s (e) fl(de,ds). 
J t J t J E 


( 2 . 12 ) 


According to Theorem 2.1 in [1], the well-posedness of the BSDE in (2.12) requires 
the generator f is globally Lipschitz continuous, i.e., there exists K > 0 such that 


\f{t,x,y) - f(t,x',y')\ < K(\x - x'\ + \y-y’\) 
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for all 0 < t < T with x,x' £ R d and y,y' £ R. In this case, there exists a unique 
solution ( Y t ,U t ) £ S 2 x L 2 (p) where S 2 is the set of J't-adapted cadlag processes 
[Y t , 0 < t < T} such that 


r*ii| a =E 


( sup 

\0<t<T 



2 


< 00 , 


and L 2 (jl) is the set of mappings U : 12 x [0, T] x E — > R such that 


I U t 


l=(A) 


= E 


U t (e) 2 X(de)dt 


< oo. 


To relate the solution ( Y t ,Ut ) to the viscosity solution u of the nonlocal diffusion 
equation in (2.3), we assume ip and / are continuous functions satisfying 

\f(t, sc, 0)| < C(1 + |£c| p ) and \<p(x)\ < C(1 + |a;| p ), (t, x) £ [0, T] x R d , 

for some real C, p > 0. Then, under the condition that X s = x for a fixed s £ [0, T), 
the solution (Y t ,U t ) for s < t < T can be represented by 

Y t = u(T — t, X t ) and U t = u(T - t, X t _ + e) - u(T - t, X t _), 

where u £ C([0,T] x R d ) is the viscosity solution of (2.3) satisfying 

| u(t, x)\ < C{ 1 + |£c| p ), (t, x) £ [0, T] x R d , 


again, for some real C,p > 0. Moreover, if ip and / are bounded and uniformly 
continuous, then u is also bounded and uniformly continuous. 

Now we introduce the following notations which will be used throughout. Let Xl' x 
(t < s < T) be a cr-field generated by the stochastic process {x + X r — X tl t < r < s} 
starting from the time-space point {t,x), and set F t,x = Tj?. Denote by E[-] 
the mathematical expectation and E* ,x [ • ] the conditional mathematical expectation 
under the cr-field Tl’ x (t < s < T), i.e., E‘ ,a:; [-] = E[- IJ 7 *’®]. Particularly, for the sake 
of notational convenience, when s = t, we use E*[-] to denote E[- (Jq’*], i.e., the 
mathematical expectation under the condition that X t = x. 

3. Numerical schemes for BSDEs with jumps. In this section we propose 
numerical schemes for the BSDE in (2.12) in order to solve the nonlocal diffusion 
equation in (2.3). Specifically, a semi-discrete scheme for time discretization is studied 
in §3.1, and a fully-discrete scheme is constructed in §3.2 by incorporating appropriate 
spatial discretization techniques. 

3.1. The semi-discrete scheme. For the time interval [0, T], we introduce the 
partition 


% — {0 — <o < ’ ’ ’ < tN — T} 


(3.1) 


with At n = f n +i — t n and r = maxo< n <jv_i Af n . In the time interval [t n ,t n + 1 ] for 
0 < n < IV — 1, under the condition that X tn = x £ R d , the BSDE in (2.12) can be 
rewritten as 


e p(de,dr ) for t n < s < t n +\, 


X s — x + 


(3.2) 
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f(s,X s ,Y s )ds- 


U s {e)jl(de , ds). 


(3-3) 


IE 


Taking the conditional mathematical expectation E* [•] on both sides of (3.3), due 
to the fact that f* f E U a (e)p,(de, ds) for t > t n is a martingale, we obtain that 


Y tn =El[Y tn+1 } + 


E f[f(s,X a ,Y s )]ds, 


(3.4) 


where E£jY t J = Y tn (x) = u(T -t n ,x), Y tn+l = Y t „ +1 ( x t„+i) = u(T - t n+1 , X tn+1 ) 
and the integrand E f n [f(s,X s ,Y s )] is a deterministic function of s € [t n ,t n + 1 ]. To 
estimate the integral in (3.4), several numerical methods for approximating integrals 
with deterministic integrands can be used. Here, we utilize the 0-scheme proposed in 
[20, 21], which yields 


y tri =K n [Y tn+1 ] + (i-e)At. 

+ 9At n f(t n , X t 


El[f(t n+1 ,X tn+1 ,Y tr 


,Y t J + R n 


(3.5) 


where 9 e [0,1] and the residual R n is given by 

rtn+1 . 

Rn = J {e llf( S ,X s ,Y s )]-6f(t n ,X tn ,Y tn ) 

- (1 - mi [/(tn+1 - *t„ +1 . Y tn+1 )] }ds. 


(3.6) 


Next we propose a semi-discrete scheme for the BSDE in (2.12) as follows: given the 
random variable Y/v = Y N = p(Xj) as the terminal condition, for n = N—l ,..., 1, 0, 
under the condition that X n = x , the solution Y tn is approximated by Y n satisfying 


{ -^n+1 = X n + J e /j,(de, At), 

Y n = E* [Y n+ 1] + (1 - 9)At n Ef n [f(t n+1 ,X n+1 ,Y n+1 )] (3 ' 7) 

+ 0At n f(t n , X n , Y n ), 

where 0 < 9 < 1. By choosing different values for 9, we obtain different semi-discrete 
schemes, e.g., 9 = 0, 9 = 1 and 9 = ^ lead to forward Euler (FE), backward Euler 
(BE) and Crank-Nicolson (CN) schemes, respectively. 

Note that since X t is the standard compound Poisson process, the probability 
distributions of X t or any incremental X t — X f j , with 0 < t' < t < T, are well known. 
Thus, one does not need to discretize X t so that, in (3.7), X n+l denotes the exact 
solution evaluated at t n+ \. In this case, R n in (3.6) is the local truncation error of 
the scheme (3.7); a rigorous error analysis of R n is given in §4. Other time-stepping 
schemes, e.g., linear multi-step schemes [22], can be directly used in order to further 
improve the accuracy of time discretization. Note that in the general case where the 
coefficients b and cr in (2.5) and (2.7) are functions of t and X t , another time-stepping 
scheme [18] is also needed for to discretize X t , so that an extra local truncation error 
will be introduced into the senri-discrete scheme (3.7). This is beyond the scope of 
this effort but will be the focus of future work. 
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3.2. The fully-discrete scheme. The semi-discrete scheme (3.7) is not compu¬ 
tationally feasible, because the involved conditional mathematical expectation Ef [•] 
is defined over the whole real space R d . Thus, an effective spatial discretization ap¬ 
proach is also necessary in order to approximate Ef [ • ]. To proceed, we partition the 
d-dimensional Euclidean space R d by 

S d h =SlxS 2 h 2 x...xSt d , 


where S^ k for k = 1,.., d is a partition of the one-dimensional real space R, i.e., 


c k 

^h k - 


<Xi x 1 - € R,* £ Z, x’l < x!- +1 , lim x\ = +oo, lim x\ = — oo}, 

l i—>•+oo i—> — oo J 


such that hk = maxi£z{\xf — a;*_ 1 1} and h = maxi <k<d,hk- For each multi-index 
i = (ii, * 2 ,..., id,) G the corresponding grid point in Sf t is denoted by x\ = 

K,-••,0- 


Now we turn to constructing a quadrature rule for approximating E^j [ • ] in (3.7) 
for (t n ,x i) G T t x Sf t . Such expectation is defined with respect to the probability 
distribution of the incremental stochastic process AX n+ i = X n+ i — X n = X n+ i — x\, 
which is a compound Poisson process starting from the grid point (t n ,Xi). Here 
we take E^ 1 [K„yi] as an example and the proposed quadrature rule can be directly 
extended to approximate E*‘ [/(t n+1 , X n+ \, y n+1 )]. It is well known that the number 
of jumps of X t within (t n ,t n + 1 ], denoted by N t , follows a Poisson distribution with 
intensity 0 < A < + 00 , and the size of each Poisson jump follows the distribution 
p(e)de defined in (2.10). Thus, E^[Y n+ i] can be represented by 


m =0 
00 


^k 


E Z |7»+i] = E P {^tn+1 - N t „ = to} E Y n+1 *i + E 

x k =1 
/ m 

^n+1 ( ^ ^ 


= ^ exp(—AAf n ) E 


m =0 


ml 


= exp(-AA t n ) Y n+ i(*i) + Y exp(-AA t n ) 


k =1 

(AA t 


7.! 


m —1 


/E J E 


^n+1 ^ ^ ^k \ ( 11 Pi^k) j • • • d&fn, 


k=l 7 x k =1 


(3.8) 


where = {e\,. .. ,e d ) for k = 1 ,...,m is the size of the fc-th jump. Observing 
that the probability of having m jumps within (t n ,t n+ 1 ] is of order 0((\At n ) m ), the 
conditional mathematical expectation E* 4 [T n+1 ] can be approximated by a trunca¬ 
tion of (3.8), i.e., the sum of a finite sequence, where the number of terms retained 
is determined according to the prescribed accuracy. For example, if we take 9 = A 
in (3.7), we expect a second-order convergence from the time discretization, i.e., the 
local truncation error R n in (3.5) should be of order 0((At n ) 3 ). In this case, assum¬ 
ing the intensity value A is of order 0(1), the first three terms should be retained in 
(3.8) in order to guarantee the error from the truncation of (3.8) matches the order of 
R n . Hence, in the sequel, we denote by E*‘ M [T n +i] the approximation of E^[T„ + i] 
by retaining the first M y + 1 terms, where M y indicates the number of jumps in¬ 
cluded in E^ 1 M [Y n . |_i]. An analogous notation E^J M ,[f n + 1 ] is used to represent the 
approximation of E Zlfn+i] by retaining Mf jumps. 
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Next, we also need to approximate the m x d-dimensional integral in (3.8) for 
m = 1 This can be accomplished by selecting an appropriate quadrature 

rule base on the properties of p{e) and the smoothness of Y n+ i. A straightforward 
choice is to utilize Monte Carlo methods by drawing samples from IIl-Li P( e k), but 
this is inefficient because of slow convergence. To enhance the convergence rate, an 
alternative way is to use the tensor product of high-order one-dimensional quadrature 
rules, e.g., Newton-Cotes, Gaussian, etc.. In addition, for multi-dimensional problems, 
i.e., d > 1, by requiring second-order convergence 0((At) 2 ), i.e., M y = 2, sparse- 
grid quadrature rules [4, 12] can be applied to further improve the computational 
efficiency. Without loss of generality, for m = 1 we denote by {w™}^ 

and {a y ’ m ,..., the set of weights and points, respectively, of the selected 

quadrature rule for estimating the m-th integral in (3.8). Then, the approximation 
of E£[r„+i], denoted by My [Y n+1 \ is given by 

^T n ,M y [ y "+i] = exp(-AAf) Y n+1 ( Xi ) 

+ £ exp(-A4 t )<^A E<r„ +1 (x, + £«?”)■ <3 ' 9 ’ 

m —1 9—1 k —1 

Note that it is possible that the quadrature points {aq + Y^k=i i n (3-9) 

do not belong to the spatial grid Sf r The simplest approach to manage this difficult 
is to add all quadrature points to the spatial grid at each time stage. However, this 
will result in an exponential growth of the total number of grid points as the number 
of time steps increases. Thus, we follow the same strategy as in [20, 21, 22], and 
construct piecewise Lagrange interpolating polynomials based on Sh to approximate 
Y n+ i at the non-grid points. Specifically, at any given point x = (x 1 ,. .. ,x d ) € R d , 
Y n +i(x) is approximated by 


1 n +1 (* e ) 


P+1 

Y n +i,p(x) = y ] • 

ji=i 



1 n+l,p 


n n 

fc=l 1<3<P+1 


where Y n+ i, p {x) is a p-th order tensor-product Lagrange interpolating polynomial. 
For k = 1,... ,d, the interpolation points {xf c are the closest p + 1 neigh¬ 
boring points of x k such that (ar^ ,..., x d . ), for jk = 1,.. • ,p +1 and k = 1,..., d, con¬ 
stitute a local tensor-product sub-grid around x. The fully-discrete solution at the in¬ 
terpolation point (xj .^,..., xf. ) € S d is given by Y^[' p ' hd . Note that Y n+ltP (x) does 

not interpolate the semi-discrete solution Y n+1 (x) due to the error Y n+ i(x{) — Y^ +l p . 
Therefore, the fully-discrete scheme of the BSDE in (2.12) is described as follows: 
given the random variable Yn(x i) for i £ Z d , and for n = N — 1,..., 0, solve the 
quantities Y' l p for i € Z d such that 


x n . 1 — X[ 


e p{de, At), 


Yn,p P'ra+qp] + (1 0)At n E t ^Mf [f {^n+1 1 -^ra+1 j ^ra+l,p)] 

+ 9At n f (t n , X n , Y^ p ) , 


(3.10) 


where 9 £ [0,1] and the non-negative integers M y , Mf indicate the Poisson jumps 
included in the approximations of E®‘ [Y n+ i tP \ and E**[/], respectively. 
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Note that, by using the quadrature rule, the approximation in (3.9) is defined in 
a bounded domain for a fixed aq £ S d . As such, the approximate solution Y 0p in 
any bounded subdomain of R. d can be obtained by solving the fully-discrete scheme 
(3.10) in a bounded subset of S[. We also observe from (3.10) that at each point 
(t n ,x i) £ 7 r x S d , the computation of Yf ;p only depends on X n+ \ and Y n+ i jP even 
though an implicit time-stepping scheme (6 > 0) is used. This means {Y r ' p j; e z 
on each time stage can be computed independently, so that the difficulty of solving 
linear systems with possibly dense matrices is completely avoided. Instead, Yf p can 
be either computed explicitly (in case / is linearly dependent on Y^ p ), or obtained by 
solving a nonlinear equation (in case / is nonlinear with respect to Y] p ). This feature 
makes it straightforward to develop massively parallel algorithms, which are, in terms 
of scalability, very similar to the CTRW method. Moreover, the scheme (3.10) is 
expected to outperform the CRTW method because it achieves high-order accuracy 
(discussed in §4) from the discretization and is also capable of solving problems that 
include a general nonlinear forcing term g. In fact, the combination of accurate 
approximations and scalable computations are the key advantages of our approach, 
compared to the existing methods for the nonlocal diffusion problem in (2.3). 

Remark 1. The nonlinear Feynman-Kac formula converted the effect of the 
nonlocal operator C in (2.1) to the behavior of the corresponding compound Poisson 
process, which is accurately approximated by the conditional mathematical expectations 
E* [•]. As such, there is no need to discretize the nonlocal operator in the fully- 
discrete scheme (3.10), so that stability does not require any CFL-type restriction 
on the time step; this is in contrast to classic numerical schemes for which explicit 
schemes do require a CFL condition. Moreover, according to the analysis in [20, 22], 
the approximation (3.10) is stable as long as the semi-discrete scheme is absolutely 
stable. 

Remark 2. The total computational cost of the scheme (3.10) is dominated by 
the cost of approximating E* 1 [ • ] at each grid point aq £ S d , using the formula in 
(3.9). For example, when solving a three-dimensional problem (d = 3) and retaining 
two Levy jumps (M y = Mf = 2,) our approach requires the approximation of multiple 
six-dimensional integrals. In this case, sparse-grid quadrature rules [19] can be used 
to alleviate the explosion in computational cost coming from the high-dimensional 
systems. 


4. Error estimates. In this section, we analyze both the senri-discrete and 
fully-discrete schemes, given by (3.7) and (3.10) respectively, for the BSDE in (2.12). 
Without loss of generality, we only consider the one-dimensional case (d = 1) and 
set 9 = \ ; the following analysis can be directly extended to multi-dimensional cases. 
For simplicity, we also assume T t and S] are both uniform grids, i.e., At = At n for 
n = 1,..., N and Aa: = aq — aq_i for i £ Z. 

Note that the well-posedness of the BSDE in (2.12) only requires global Lipchitz 
continuity on / with respect to X t and Y t . However, in order to obtain error estimates, 
we need to impose stronger regularity on /. To this end, we first introduce the 
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following notation: 


C b k (D x x • • • x Dj) 



a \ip 

dx a 


is bounded and continuous for 0 < |a| < k 


where a = (ai ,..., aj) with { aj}J =1 C N and |a| = a\ + ■ ■ ■ + aj 


where D\ x • • • x Dj C R J . Next we present the following lemma that relates the 
smoothness of f(t, x, y ) to the regularity of the solution u(t, x ) of the nonlocal diffusion 
equation in (2.3). 

Lemma 1. Under assumptions in (2.2) and (2.4), if ip(x) € C b (R), f(t,x,y) € 
C b ([0,T] x R 2 ) in (2.12) and 9a , ^^ a2 with a\ + < k is uniformly Lipschitz con¬ 

tinuous with respect to x and y, then the nonlocal diffusion equation in (2.3) with 
d = 1, uo(x) = <p'{x) and g(t,x,u) = f(T — t,x,u) has a unique viscosity solution 
«((i)€C t *([0,T]xl). 

Proof. Here we only prove the case of k = 1, as the cases k > 1 can be proved by 
recursively using the following argument. Based on the regularity of /, it is easy to 
see that ^ is bounded and continuous with respect to t, so that we focus on proving 
the regularity of jff. Let VXt, VYj and Vt/ t be the variations of the stochastic 
processes X t , Y t and U t with respect to X t in (2.12), respectively. Then the triple 
(VX t , VI*, Vf7 t ) is the solution of the following BSDE. 


r vi t =i, 

VYj =<p'(X T ) + J f x (s, X s , Y s ) + f y {8, X s ,Y s )XY s ds 


VC/s(e) jl(de, ds), 


J E 


(4.1) 


which is a linear equation with respect to VX S , S7Y s and Vf7 s . Due to the regularity 
of /, the BSDE in (4.1) has a unique solution (VY). VU t ) where VY t is bounded 
and continuous. By the relationship between BSDEs and PIDEs, we see that VY t = 
w(t,X t ) where w(t,x) is the unique viscosity solution of the following PIDE [1] 


— (t,x) +Cw(t,x) + f' x (t,x, v) + f' v (t,x,v)w = 0 (t,x) G [0, T] x R, 

w(T,x ) = ‘fi'(x), for 


(4.2) 


where v = u(T — t, x) is the transformed solution of the nonlocal problem in (2.3) 
with d = 1, uq = <p'(x) and g(t,x,u) = f(T — t,x,u). Then, by differentiating the 
equations in (2.3) with respect to x and comparing the resulting equations with (4.2), 
we can see that w(t,x ) = |^(T — t,x). Due to the continuity and boundedness of 
w(t, x), we have that u(t,x) € Ci([0, T] x R), which completes the proof. □ 

Next we provide the estimate of the local truncation error R n in (3.6). 

Theorem 1. Using Lemma 1 with k = 2, the local truncation error R n in (3.5), 
for the semi-discrete scheme (3.7) with 9 = \ can be bounded by 


E[|i?n|] < C'(At) 3 for n = 0,..., N - 1, 


(4.3) 
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where the constant C depends only on the terminal time T, the jump intensity A, the 
upper bounds of f, ip, and their derivatives. 

Proof. The formula (2.8) shows that the solution Y t of the BSDE (2.12) can be 
represented by Y t = u(t,X t ). Under Lemma 1, if / € C 2 ([0,T] x R 2 ) and ip g C 2 (R), 
then u(t,X t ) € C 2 ([0,T] x R). Thus, by defining F(t,X t ) = f (t,X t ,u(T — t,X t )), we 
have F € C 2 ([0,T] x R). Before estimating the residual R n in (3.5), we define the 
differential operators L° and L 1 by 


dF 


dF, 


L"F(t,X t )= — (t,X t ) + 6— (t,X t ) 


dt 


dx 


IE L 


dF, 


F(t, X t _ + e) - F(t, X t -) - X t _)e 


A (de), 


{ 1}F(t, X t ) = F(t, X t _ + e) - F(t, X t _). 


(4.4) 


Based on the definition of X s in (2.12), the integral form of the Ito formula of F(s, X s ) 
for t n < s < t n+ 1 , under the condition X tn = x, is given by 


F(s,X s ) = F(t n ,x ) + 


L°F(r, X r )dr 


X r _)ji(de, dr). (4.5) 


Thus, substituting the above formula into 


Jt n JE 

+1 E* n [F(s, X s )]ds, we obtain 


Ef [F(s, X s )]ds 


E? 


rtn+1 

=F(t n ,x) / ds + 


F{t n , x) + / L°F(r, X r )dr + / l}F{r,X r J)ji{de,dr) 
Jt n Jtn 

rtn+i ps 


ds 


=F(t„,x)At + 


rtn + 1 pS 


Ef 


E * [L a F(r,X r )]drds 


Lr F{t n ,x) + / LrLr F( Zl X z )dz 


(4.6) 


drds 


1 


=F(t n ,x)At + A (A t) 2 L°F(t n ,x) + 


rtn+i ps pr 


Ef [LrIrF(z, X z )\dzdrds. 


Similar derivation can be conducted to F(t n+1 , X t ) to obtain 
j^ +1 El[F(t n+u X tn+1 )]ds 

rtn +1 rt„+i rr 

=F(t n ,x)At + (At) 2 L°F(t n ,x) + J | J J Ef n [L°L°F(z,X z )]dzdrds. 

Therefore, the residual R n in (3.5) with 9 = A can be represented by 


(4.7) 


Rn = 


El[F(s,X s )} - -F(t n ,x) - -E l[F(t n+1 ,X tn+1 )\ds 


rtn+i ps pr 


EfjL^L^Fiz^^dzdrds 

/ t n Jt„ 

1 pt n + l ptn+1 pr 
1 ' • • -^?x [rO rO 


(4.8) 


E fjL"L"F{z,X z )]dzdrds. 
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Then, taking the mathematical expectation of R n , and using Cauchy-Schwarz inequal¬ 
ity, we have that 


E[|i?„|] < E 


rtn+i rs pr 


E ^[L°L {> F(z,X z )}dzdrds 


1 


E 


2 

< (At)* 


/*„ Jt n 

rt n +l rtn+1 pR 


Ej [L"L"F(z,X z )]dzdrds 


rtn+l PS pr 


E[\L°L°F(z, X z )\ 2 ]dzdrds 




rtn +1 ptn +1 pr 


E[\L°L°F(z,X z )\ 2 }dzdrds 


<* sup y/E[\L°L°F(t,X t )\*](At) 3 

* 0 <t<T 

< C(Atf, 


(4.9) 


where, by the definition of L°, it is easy to see that the constant C only depends 
on the terminal time T, the jump intensity A the upper bounds of /, ip and their 
derivatives. □ 

We now turn our attention to proving an error estimate for semi-discrete scheme 
(3.7). 

Theorem 2. Let Y tri and Y n , for n = 0, 1, • • • , N, be the exact solution of the 
BSDE (2.12) and the semi-discrete solution obtained by the scheme (3.7), respectively. 
Then, under the same conditions as that for Theorem 1, for sufficiently small time 
step At, the global truncation error e n = Y tn — Y n , for n = N — 1,..., 0, can be 
bounded by 

E[|F t „ -Y n \\ <C(At) 2 , (4.10) 


where the constant C is the same as in Theorem 1. 

Proof Given 0 < n < N — 1 and i£l, subtracting (3.7) from (3.5), we have 


e n — E? n [e n+ 1] + —Ef n [f(t n+ i,X tn+1 ,Y tn+1 ) - f(t n+ i,X n+ i,Y n+ i)\ 
4—^-[ f(t n , X tn , YtJ — f{t n , X n , Y n )] + R n 


(4.11) 


Due to the Lipschitz continuity of / and the fact that X tn+1 = X n+ i, X tn = X n = x, 
we get the bound 


\ e n\ < E? [|e n+ i|] + 


LAt 


Ef [| e n+l|] 


LAt , 


| Rn 


(4.12) 


where L is the Lipschitz constant of / and the time step size At satisfies 1 — > 0. 

Substituting the upper bound of R n , the above inequality can be rewritten as 



&n+ 11 ] + 


C(Af) 3 

-i LAt ' 
1 2 


(4.13) 
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Taking mathematical expectation on both sides, we obtain, by induction, an upper 
bound of E[|e„|], i.e., 


E[|e„|] < (1 + LAt) iV_ ”E[|ejv|] 

C(At) 2 


C(At) 


3 (1 + LAt) 


N-n 


- 1 


LAt 


< 


(e L1 1), 


(4.14) 


where the constant C is the same constant from (4.9). □ 

Next, we focus on analyzing the numerical error for the fully-discrete scheme 
(3.10). For 0 < n < N — 1, is constructed using piecewise p-tli order Lagrange 
polynomial interpolation on the mesh Sh- As for the quadrature rules involved in 
E*' i M [ • ], without loss of generality, we consider a special case stated in the following 
assumption. 

Assumption 1. For d = 1 and m = 1, ..., M y , the quadrature rules 
{a ^ ,m ,..., a™’ m }®A 1 used in (3.9) are assumed to be the tensor-product of the selected 
one-dimensional quadrature rule, denoted by {IF,-, OjljL 1 . The error in the quadrature 
rule is represented by 


IE J E 


^T n +1 4” ^ )(n^ efe )) de i • • • de. 


k=l 7 x /c=l 
Q Q / m \ / m 

- Z) ■ ■ ■ (ft ^(*< + J2' 1 

gi=i <? m =l k fc=1 / V fc=1 


(4.15) 


< CQ- r , 


where the convergence rate r > 0 and the constant C are determined by the smoothness 
ofY n .|_i and p. The same strategy is also applied to E^ M [/]. 

Then, with Assumption 1, the error estimate for the fully-discrete approximation 
(3.10) is given in the following theorem. 

Theorem 3. Let Y tri and Yf p , for ii = 0,l,...,JV,ieZ, be the exact solution of 
the BSDE (2.12) and the fully-discrete approximation obtained by the scheme (3.10) 
with 9 = respectively. Under Assumption 1 and the conditions in Theorems 1 and 
2, the error e l n = Yfff ~Yf p can be bounded by 

max |ejj < C [(At) 2 + (AA t) M * + (AA t) M f +1 + Q~ r + (Ax) p+1 l , (4.16) 


where the constant C is the same constant from (4.9). 

Proof. At each grid point a € Sh, substituting X tn = Xi into (3.5), we have 

Y t x f =EZ[Y tn+1 ] + ^E Z[f(t n+ i,X t ^,Y tn+1 )} + ^/(t„, a*, !£*) + *», (4.17) 

where Yf ‘ is the value of the exact solution at the grid point ( t n ,Xi ). Subtracting 
(4.17) from (3.10) leads to 

4 = E£[y t ] - E Z M \Y n+1 J + ^ {Ef* [f(t n+ i , X tn+1 , Y tn+1 )] 


2 l 


~ [f(tn+l, X n +1, ^n+l,p)]} + Y/f) — f(t n ,Xi,Yfl p )] + R 


= h + h 


R-r, 


(4.18) 
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where 


h = E tni Y tn + l] ~ E tl My i Y n + l,p\, 

^2 = X tn+11 Y tn+1 )] — E t ^ M ^ X n _|_i, y„ + i jP )] 

h = ^f[f(tn,Xi,Y t x n z ) - f{t„,Xi,Y^ p )]. 


(4.19) 


Due to the Lipschitz continuity of /, it is easy to see that 


|/ 3 |<CAt|4|, 


(4.20) 


where C depends on the Lipschitz constant of /. We can split I\ in the following way 


h =E£[r tB+1 ] -E** iM# [r tB+ 1 ]+E** iM# [y tB+1 ] -E^jy^j 


i'll 


Y [^*«+l ^tn + lip] d _ ®t’,M„[^tn+l,P Y n + l,p], 


(4.21) 


Jl3 




where p is the p-th order Lagrange polynomial interpolant of Y tri . By the 
definition of E^[-] and E M [■] in (3.8) and (3.9), we have 


iAii= E 


-AA t 


(aa ty 


m=M y +1 

< C(AAt) M " +1 , 


m! 


Je 


k =1 


■^Ti+i “1“ ^ y e k )(n^)) dei • • • de. 


k—1 


(4.22) 

for AA t < 1, where C depends on the upper bounds of |1* | and \p\. For the error 

/i 2 , we obtain 


I\2\ — 


K,Mj Y ^- E Z,M V ^n +l ] 


E e 

m=M y -\-1 


-aA t (AAt) m I r 

ml | J E J e 


m \ / m 


^71+1 (*<+E IIpM dei • • • de. 


k=i 7 x fc=i 

Q Q / m \ ✓ m 

- E - • E (n ^+E 


< E 

ra—My + 1 


91=1 g m = l x fc=l 

e -AA« (AAf) m _ 
to! 




fc=l 


<CXAtQ~ r 

(4.23) 

for A At < 1 where C depends on the upper bound of the derivatives of Y t . For /i 3 , 
based on the fact that Y trl+1 (xi) — Y tn+ltP (xi) = 0 for Xi € Sh and the classic error 
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bound of Lagrange interpolation, we have 


I i.i I = 


E: 


't n ,My [-^in+1 ^ t„ + 1 ,p 


M v 


< e XAt | Y tn+1 (xi) - Y tn+ltP (xi )| + J2 e 


_ AAt (AAtr 
to! 


Q 


Q 


e e n 

91 = 1 9m-= 1 V /C=l 

Ml 

-XAt 


W, 


Qk 


k =1 


^tn +1 ( ^ ] a qk ) ,p ( ^ 'y v 


x 9fc 


/c—1 


(aa ty 


771: 


< c(A X y +1 • £ 

m=l 

< CAAt(Aa;) p+1 , 

Also, for 7i4, we deduce that 

1-^141 — ^’,M,K, + i,p“ y ti+l,p] 

— e |A* n+lj p(a;i) — l A n+i, P (a;i)| + ^ [ e 

< (l + CAtA Ax ) max |4 +1 |, 


(4.24) 


My 


: (AAf) r 
to! 


■ aaa E- A ^ m jfi< +1 i 


iez 


(4 -2 5) 

where A Ax is the Lebesgue constant of the p-th order Lagrange interpolant under the 
mesh Sh- 

Combining In, Ii 2 ,Ii 3 , hi, we obtain the following upper bound of |/i|, i.e., 

|Ji| < C [(AAt) M » +1 + AA tQ~ r + XAt(Ax) p+1 ] + (l + CAtA Ax ) max |4 +1 |. (4.26) 

Similarly, we can obtain the following upper bound of |A 2 1, i.e., 


\h\ < CAt 


(AAt) M/+1 + XAtQ~ r + XAt(Ax) p+1 + (l + CAtA Ax ) max |e‘ 


n +11 


(4.27) 


Substituting (4.26), (4.27) and (4.20) into (4.18), we have 

(1 - CAt) max \e l n \ < (1 + CAt) max \e l n+1 \ 

+ C [(AA t) M y +1 + At(AAt) M ^ +1 ] + XAtQ~ r + CXAt(Ax) p+1 + C(At) 3 . 

(4.28) 

Therefore, for sufficiently small At, we obtain, by induction, 

max |4| < C [(At) 2 + (AAt) M » + (AA t) M f +1 + Q~ r + {Ax) p+1 ] . (4.29) 


□ 


5. Numerical examples. In this section, we report on the results of three one¬ 
dimensional numerical examples, that illustrate the accuracy and efficiency of the 
proposed schemes (3.10). We take uniform partitions in both temporal and spatial 
domains with time and space step sizes At and Ax, respectively. The number of 
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time steps is denoted by N, which is given by N = with T is the terminal time. 
For the sake of illustration, we only solve the nonlocal problems on bounded spatial 
domains. Lagrange interpolation and tensor-product quadrature rules are used to 
approximate E^’ : [ • ], where the one-dimensional quadrature rule for each example is 
chosen according to the property of the kernel 7 (e), the initial condition uq and the 
forcing term /. Our main goal is to test the accuracy and convergence rate of the 
proposed scheme (3.10) with respect to At and Ax. To this end, according to the 
analysis in Theorem 3, we always set the number of quadrature points to be sufficiently 
large so that the error contributed by the use of quadrature rules is too small to affect 
the convergence rate. 

5.1. Symmetric kernel. First we consider the following nonlocal diffusion prob¬ 
lem in [ 0 ,T]: 



where 5 > 0 which corresponds to the symmetric kernel 


7 (e) 


^ 3 , for e G [-5,5], 
0 , for e ^ [—5, 5]. 


(5.2) 


We choose the exact solution to be 

u(t, x ) = (-x 3 + x 2 ) exp ( - , 

so that the forcing term g is given by 

»(<•*) = —^ + (2*-f) e*P (-^), 

and the initial condition ip(x) can be determined from u accordingly. After converting 
the problem (5.1) to a BSDE of the form in (2.12), we have 

A= ^’ P( e ) = & = 0 ’ /(*>■>') = 9(T -t,-,-), 

where Zr_g t gi(e) is a characteristic function of the interval [—5, 5]. 

We set the terminal time T = 1 and solve (5.1) on the spatial domain [0,1] £ R. 
Since the density function p(e) is uniform with the support [—5,5], we use tensor- 
product Gauss-Legendre quadrature rules, with Q = 16, to approximate the integrals 
involved in E*‘ [ • ]. First, we test the convergence rate with respect to At. To this end, 
we set N x = 65 and use piecewise cubic Lagrange interpolation (p = 3) to construct 
Y ntP for n = 0,..., N — 1, so that the time discretization error dominates the total 
error. For 5 = 1, the number N of time steps is set to 4,8,16,32,64, respectively. 
The numerical results are shown in Table 5.1. As expected, the convergence rate with 
respect to At depends not only on the value of 9 , but also the number of jumps retained 
in constructing M [•] and E^ M [•]. For example, when M y = Mf = 0, i.e., no 
jump is included in E*^ M [•] and E*^ M [•], the scheme (3.10) fails to converge. In 
general, in order to achieve a fc-th order convergence (k = 1,2), M y and Mf must 
satisfy M y > k and Mf > k — 1. 
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Table 5.1: Errors and convergence rates with respect to At in Example 5.1 where 
T = l,S = l,N x = 65,p = 3. 



We observe in Theorem 3 that the errors with respect to At from E* * M [ ■ ] and 
E*^ Mfl'} are order (AA t) My and (AA t) Mf+1 , respectively. If the intensity A is 
large, i.e., the horizon 6 is small, the value of <5 will affect the error decay in the 
pre-asymptotic region and change the total error up to a constant. Such phenomenon 
is investigated by setting <5 = 0.3 and 0.4, and 0 = 1 and |. The results are shown in 
Figure 5.1. In Figure 5.1(a) with 6 = 1, in the case that S = 0.3 (A « 22), M y = 1 
and Mf = 0, the error is actually of order O(AAt). When one more jump is included, 
i.e., M y = 2 and Mf = 1, we observe that the convergence rate remains the same but 
the total error is significantly reduced, since the error contributed by E*' M [•] and 
E*)) M f [ ‘ ] i s reduced to G((AAt) 2 ). 



10° 10 1 10 2 10 3 10 4 10 5 10° io’ 10 2 10 3 io 4 10 5 

Number of time steps N Number of time steps N 


Fig. 5.1: Error decays for (a) 6 = 1 and (b) 9 = \ in Example 5.1. 


Next, we test the convergence rate with respect to Ax by setting S = 1, T = 1, 
N = 1024, M y = 3 and Mf = 2. We divide the spatial interval [0,1] into 
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N x elements with Ax = 2 -3 , 2 -4 , 2 -5 , 2~ 6 , 2~”‘ . The error is measured in L°° and 
L 2 norms. In Table 5.2, we can see that the numerical results verify the theoretical 
analysis in §4. 


Table 5.2: Errors and convergence rates with respect to Ax in Example 5.1 where 
S = 1, T = 1, e = N = 1024, My = 3 and M f = 2. 



Linear interpolation 

Quadratic interpolation 

Ax 

1 

0 

£ 

Pt 0 -^.lllioo 

11^*0-^0,211^ 

\\Y t0 -y 0 ,2|koo 

2 —3 

1.227E-03 

2.816E-03 

1.862E-04 

2.969E-04 

2 -4 

3.266E-04 

7.356E-04 

2.278E-05 

3.881E-05 

2 —5 

6.586E-05 

1.772E-04 

2.792E-06 

4.661E-06 

2-6 

1.651E-05 

4.481E-05 

3.491E-07 

6.252E-07 

2~ 7 

4.167E-06 

1.103E-05 

4.738E-08 

9.188E-08 

CR 

2.071 

2.001 

2.991 

2.927 


5.2. Singular kernel. We consider the following nonlocal diffusion problem in 




du 1 


i(t, x + e) — u(t , x) 


dt s^VsJs 7 R 

w( 0 ,x) = <p(x), 
where 5 > 0 which corresponds to the singular kernel 7 (e) 

-, for e G [—5, <5], 


de = g(t , x), t > 0 , 


(5.3) 


7(e) = &>/m' 

0 , for e$.[—5,S\. 


(5.4) 


We choose the exact solution to be 


u(t, x) = (x + t) 2 , 

so that the forcing term g is given by 

g{t,x) = 2{x + t) - \ 

5 

and the initial condition ip(x) can be determined from u accordingly. After converting 
the problem (5.1) to a BSDE of the form in (2.12), we have 

A= ^’ P(e) = b = 0, = g(T -t,-,-), 

where Ir_j ) {i(e) is a characteristic function of the interval [—<5, <5]. 

We set the terminal time T = 0.25 and solve the equation (5.3) on the spatial 
domain [0,1] € R. It is observed that the density function p(e) is singular at e = 0, 
but it is still integrable in the interval [—<5, <5]. Recalling that l/\/fe[ is a special case 
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of the kernel of the Gauss-Jacobi quadrature rule, i.e. / a b ^(a;)(6 — x) a (x— a)^dx with 
a,P € [—1,1], where the kernel is given by (b — x) a (x — a)^. Thus, the integrals 
involved in ; M [■] and E^ ; M [ ■ ] can be accurately approximated by setting a = 0, 
b = S, a = 0 and /3 = — In this example, we use 16-point Gauss-Jacobi rule such 
that the quadrature error can be ignored. We test the convergence rates with respect 
to At by setting 6 = 1, N x = 65 and p = 3, and the results are given in Table 
5.3. As expected, due to the use of Gauss-Jacobi rule, the temporal truncation error 
dominates the total error and the theoretical result in Theorem 3 has been verified as 
well. The convergence with respect to Aa; is also tested by setting p = 1, T = 0.25, 
9 = N = 1024, M y = 3 and Mf = 2. Table 5.4 shows the results in two cases 
where S = 1 and 0.1, respectively. We see that, for different horizon values, the spatial 
discretization error decays at the same rate verifying the theoretical error estimates 
in §4. 


Table 5.3: Errors and convergence rates with respect to At in Example 5.2 where 
T = 0.25, 6 = l,N x =65,p = 3. 


\\Yt 0 ^o,p||oo 


N = 4 

N = 8 

IV = 16 

N = 32 

IV = 64 

CR 

0 = 0, My = 0, Mf = 0 

9.732E-1 

9.554E-1 

9.461E-1 

9.413E-1 

9.390E-1 

0.013 

e = 0, My =3.1, Mf =0 

2.099E-1 

1.108E-1 

5.698E-2 

2.890E-2 

1.456E-2 

0.964 

0 = 0, My = 2, Mf = 1 

8.618E-2 

4.131E-2 

2.013E-2 

9.925E-3 

4.926E-3 

1.032 

0 = 1, My = 0, Mf = 0 

8.958E-1 

9.167E-1 

9.267E-1 

9.317E-1 

9.341E-1 

-0.014 

0 = 1, My = l,Mf = 0 

1.239E-1 

6.669E-2 

3.461E-2 

1.763E-2 

8.900E-3 

0.952 

0 = 1, My = 2,Mf = 1 

8.485E-2 

5.274E-2 

2.959E-2 

1.577E-2 

8.149E-3 

0.950 

0 = My = 0,Mf = 0 

9.345E-1 

9.360E-1 

9.364E-1 

9.365E-1 

9.365E-1 

-0.001 

0 = |, My = 1, Mf = 0 

1.669E-1 

8.874E-2 

4.579E-2 

2.327E-2 

1.173E-2 

0.959 

0 = |, My = 2, Mf = 1 

1.634E-2 

4.386E-3 

1.136E-3 

2.889E-4 

7.286E-5 

1.954 

0 = 1, My = 3,M f = 2 

4.476E-3 

1.079E-3 

2.634E-4 

6.498E-5 

1.613E-5 

2.029 


Table 5.4: Errors and convergence rates with respect to Ax in Example 5.2 where 
p=l,T = 0.25, 9 = 1/2, N = 1024, M y = 3 and M f = 2. 



<5 = 1 

8 = 0.1 

Ax 

71 

O 

1 

to 

II W, -Yo, p \\l°o 

II W, - Y 0jP \\ l; , 

II W, -Yo, p \\l°° 

2-3 

3.912E-03 

5.232E-03 

7.892E-03 

1.046E-02 

2-4 

9.934E-04 

1.318E-03 

2.030E-03 

2.618E-03 

2-5 

2.647E-04 

3.482E-04 

4.893E-04 

5.787E-04 

2-6 

6.062E-05 

8.057E-05 

1.269E-04 

1.180E-04 

2~ 7 

1.318E-05 

1.862E-05 

1.498E-05 

2.541E-05 

CR 

2.046 

2.030 

2.208 

2.184 
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5.3. Non-symmetric kernel and discontinuous solution. We consider the 
following nonlocal diffusion problem in [ 0 , T], 


(5.5) 


r du r 26 


\*-L 

u(t, x + e) — u(t, x) 

[ u(0,x) = ip{x), 


where 6 > 0 for which we have the non-symmetric kernel 

7 (e) = 

We choose the exact solution 


1, if e e [—(5,25], 
0, if e ^ [—(5, 26]. 


(5.6) 


i(t, x ) 


sin(t), if x < —, 
x 2 sin(f), if x > - 


(5.7) 


so that the forcing term g is given by 

{x + 2(5 ) 2 (x — 6 ) 2 


9(t,x) = < 


sin(f) 
sin (t) 

sin(f) 

sin(f) 


36x 


2 2 

1 (x + 2(5 ) 3 (x — (5 ) 2 

12 3 + 2 

1 (x + 2S) 3 (x — (5 ) 2 

12 3 + 2 

(x + 2(5 ) 3 (x — 6) 3 


■ COS (t), 


36x 


36x 2 


:cos (t), 


x < --26, 

- — 26 < x < -, 
2 ~ 2 


1 


1 


+ x 2 cos(i), - < x < - + 6. 


+ 


+ 3(5ar 


+ x 2 cos(f), x>\ + 5 


(5-S) 

and the initial condition tp(x) can be determined from u accordingly. After converting 
the problem (5.5) to a BSDE of the form in (2.12), we have 


A = 3(5, p(e) = ^T[s, 2 S\(e), b =^ 2 , 

Note that the kernel 7 (e) is non-symmetric so the drift coefficient b is non-zero. 
However, as shown in (2.11), non-zero b is equal to the compensator of the underlying 
compound Poisson process, such that it cancels out the compensator in the forward 
SDE in (2.12), so that it does not appear in the numerical scheme (3.10). The most 
challenging aspect of this problem is the discontinuities in u and g , which deteriorate 
the accuracy of the quadrature rule and the polynomial interpolation. Here, the 
trapezoidal quadrature rule so that the integral of the interpolating polynomials are 
evaluated exactly. To study the convergence of the scheme (3.10) with respect to 
Ax, we set T = 0.5, 9 = A and N = 512. First, we solve (5.5) on a uniform spatial 
grid with Ax = 2~ 3 , 2 -4 , 2 -5 , 2~ 6 , 2~ 7 . The errors and convergence rates are given in 
Table 5.5. As we expected, due to the discontinuity, the L°° error does not converge 
and the L 2 error converges with 0((Ax)?). 

In order to improve the convergence rate with respect to the L 2 norm, one strategy 
is to utilize adaptive grids which are capable of automatically refining the spatial grid 
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Table 5.5: Errors and convergence rates with respect to Ax in Example 5.3 for p = 1, 
T = 0.5, Q=\,N= 512. 



<5 = 1 

8 = 0.1 

Ax 

\\Yt 0 -Yo, P \\ L 2 

\\Yt 0 Yo,p\\loo 

IIH.0-5VIL2 

\\Yto Yo, p \\l~ 

2-3 

3.001E-02 

1.323E-01 

2.503E-02 

1.215E-01 

2 -4 

2.287E-02 

1.317E-01 

1.751E-02 

1.207E-01 

2- 5 

1.467E-02 

1.257E-01 

1.231E-02 

1.201E-01 

2-6 

1.107E-02 

1.254E-01 

8.676E-03 

1.197E-01 

2“ 7 

7.045E-03 

1.221E-01 

6.126E-03 

1.192E-01 

CR 

0.523 

0.030 

0.507 

0.007 


around the discontinuity. Here, we apply the one-dimensional adaptive hierarchical 
finite element method [12, 19]. This approach can be seamlessly integrated into the 
sparse grid framework to alleviate the curse of dimensionality when solving high¬ 
dimensional nonlocal diffusion problems. Figure 5.2(a) shows the solution surface 
u{t, x) we recovered in order to approximate u(0.5, x) within the spatial domain [0,1]; 
and Figure 5.2(b) illustrates the exact solution w(0.5,x) and its approximation using 
33 spatial grid points for which the L 2 error is only 9.745E-04. Further illustration 
is shown in Figure 5.3(a) and Figure 5.3(b) where, for the scheme (3.10), the error 
decay vs. the number of interpolating points and vs. the grid size Ax , respectively, 
of using uniform and adaptive grids are plotted. For the adaptive case, Ax is the 
starting grid for the refinement process. We clearly see the half-order and the optimal 
second-order convergence rates for the uniform and adaptive grid cases, respectively. 
We also study the convergence with respect to At for 9 = 1 with the error tolerance 
for the adaptive grid is set to 0.01, 0.005, 0.001 and 0.0005. In Figure 5.3(c), it can 
be seen that the desired convergence rate with respect to At is achieved if the error 
is smaller than the tolerance; otherwise, the spatial discretization error will dominate 
and the total error will not decay as At decreases. 

Of course, no amount of refinement, whether adaptive or not, will reduce the 
L°° norm error, i.e., it will remain 0(1). However, if we omit the single element 
containing the discontinuity, we see the following from Figure 5.3(d). For uniform 
grid refinement, the error remains 0(1) due to pollution of the error into neighboring 
elements which are all of size Ax. However, for adaptive grid refinement, the L°° 
error is close to (D(h 2 ) because pollution is to neighboring elements of very small size. 

6. Concluding remarks and future work. In this work, we propose a novel 
stochastic numerical scheme for linear nonlocal diffusion problems based on the re¬ 
lationship between the PIDEs and a certain class of backward stochastic differential 
equations (BSDEs) with jumps. As discussed in §1, this effort focuses on solving low¬ 
dimensional nonlocal diffusion problems with high-order accuracy, where the potential 
applications will be non-Darcy flow in groundwater, or plasma transport in magnet¬ 
ically confined fields. However, our approach can be extended to solve moderately 
high-dimensional problems by incorporating our previous work on sparse-grid approx¬ 
imation [19, 12]. Compared to standard finite element and collocation approaches for 
approximating linear nonlocal diffusion equations with integrable kernels, our method 
completely avoids the solution of dense linear systems. Moreover, the ability to utilize 
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(a) (b) 




Fig. 5.2: (a) The surface of u(t,x)\ (b) The exact solution w(0.5,x) (solid line) and 
its approximation (dashed line) using 33 grid points (red dots). 
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Fig. 5.3: Convergence of the L 2 norm of the error with respect to (a) the number of 
spatial grid points and (b) the spatial grid size Ax using uniform and adaptive grids, 
(c) Convergence with respect to At for various tolerances of the adaptive spatial 
grid, (d) Convergence of the L°° norm of the error (with the element containing the 
discontinuity ignored) with respect to Ax using uniform and adaptive grids. 


high-order temporal and spatial discretization schemes, as well as efficient adaptive 
approximation, and the potential of massively parallel implementation, make our 
technique highly advantageous. These assets have been verified by both theoretical 
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analysis and numerical experiments. 

Our future efforts will focus on extending the proposed numerical schemes to the 
case of non-integrable kernels, e.g., fractional Laplacians, where the underlying Levy 
processes cannot be described by compound Poisson processes. Moreover, the well- 
posedness of BSDEs on bounded domains, with volume constraints, is also critical for 
studying nonlocal diffusion equations on bounded domains. 
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